QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment
Previous studies have reported that common dense linear algebra operations do
not achieve speedup when using multiple geographical sites of a computational
grid. Because such operations are the building blocks of most scientific
applications, conventional supercomputers are still strongly predominant in
high-performance computing and the use of grids for speeding up large-scale
scientific problems is limited to applications exhibiting parallelism at a
higher level. We have identified two performance bottlenecks in the distributed
memory algorithms implemented in ScaLAPACK, a state-of-the-art dense linear
algebra library. First, because ScaLAPACK assumes a homogeneous communication
network, the implementations of ScaLAPACK algorithms lack locality in their
communication pattern. Second, the number of messages sent by the ScaLAPACK
algorithms is significantly greater than in other algorithms that trade flops for
communication. In this paper, we present a new approach for computing a QR
factorization -- one of the main dense linear algebra kernels -- of tall and
skinny matrices in a grid computing environment that overcomes these two
bottlenecks. Our contribution is to articulate a recently proposed algorithm
(Communication Avoiding QR) with a topology-aware middleware (QCG-OMPI) in
order to confine intensive communications (ScaLAPACK calls) within the
different geographical sites. An experimental study conducted on the Grid'5000
platform shows that the resulting performance increases linearly with the
number of geographical sites on large-scale problems (and is in particular
consistently higher than ScaLAPACK's). Comment: Accepted at IPDPS'10 (IEEE
International Parallel & Distributed Processing Symposium 2010, Atlanta, GA, USA).
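The communication-avoiding idea behind CAQR for tall-and-skinny matrices can be sketched with NumPy: factor each row block independently, then QR the stacked small R factors, so only tiny R blocks cross the slow inter-site links. This one-level reduction tree is a simplification of the algorithm (block count and matrix shapes are illustrative assumptions):

```python
import numpy as np

def tsqr(A, n_blocks=4):
    """Tall-Skinny QR via a one-level reduction tree:
    QR each row block locally (one block per site in CAQR),
    then QR the stacked small R factors."""
    blocks = np.array_split(A, n_blocks, axis=0)
    Rs = [np.linalg.qr(b, mode='r') for b in blocks]   # local, communication-free
    return np.linalg.qr(np.vstack(Rs), mode='r')       # single reduction step

rng = np.random.default_rng(0)
A = rng.standard_normal((4000, 10))                    # tall and skinny
R = tsqr(A)
# R is unique up to the sign of each row, so compare absolute values.
R_ref = np.linalg.qr(A, mode='r')
assert np.allclose(np.abs(R), np.abs(R_ref))
```

The key property is that each reduction message carries only an n-by-n R factor (here 10x10) instead of rows of the full matrix, which is what confines the heavy traffic within each site.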
Instability of precession driven Kelvin modes: Evidence of a detuning effect
We report an experimental study of the instability of a nearly-resonant
Kelvin mode forced by precession in a cylindrical vessel. The instability is
detected above a critical precession ratio via the appearance of peaks in the
temporal power spectrum of pressure fluctuations measured at the end-walls of
the cylinder. The corresponding frequencies can be grouped into frequency sets
satisfying resonance conditions with the forced Kelvin mode. We show that one
triad is associated with a parametric resonance of Kelvin modes. For the first
time, we observe a significant frequency variation of the unstable modes with
the precession ratio. We explain this frequency modification by considering a
detuning mechanism due to the slowdown of the background flow. By introducing a
semi-analytical model, we show that the departure of the flow from the solid
body rotation leads to a modification of the dispersion relation of Kelvin
modes and to a detuning of the resonance condition. Our calculations reproduce
the features of experimental measurements. We also show that a second frequency
set, including one very low frequency as observed in the experiment, does not
exhibit the properties of a parametric resonance between Kelvin modes. Our
observations suggest that it may correspond to the instability of a geostrophic
mode. Comment: 26 pages, 17 figures, accepted by Phys. Rev. Fluids.
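The triadic resonance condition underlying such a parametric instability can be written in generic form (the notation is assumed here, not taken from the paper): for a forced Kelvin mode of frequency $\omega_0$ and axial and azimuthal wavenumbers $(k_0, m_0)$, two free Kelvin modes $(\omega_1, k_1, m_1)$ and $(\omega_2, k_2, m_2)$ form a resonant triad when

```latex
\omega_1 - \omega_2 = \omega_0, \qquad
k_1 - k_2 = k_0, \qquad
m_1 - m_2 = m_0 .
```

A slowdown of the background flow shifts each $\omega_i$ through the modified dispersion relation, so the frequency condition is only approximately satisfied: this is the detuning mechanism the abstract refers to.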
MPI Applications on Grids: A Topology-Aware Approach
Large grids are built by aggregating smaller parallel machines through a public long-distance interconnection network, such as the Internet. Their structure is therefore intrinsically hierarchical, and each level of the network hierarchy exhibits latency and bandwidth that differ from the other levels. MPI is the de facto standard for programming parallel machines, and is therefore an attractive solution for programming parallel applications on this kind of grid. However, because of these differences in communication performance, applications continuously communicate back and forth between clusters, with a significant impact on performance. In this report, we present an extension of the information provided by the run-time environment of an MPI library, a set of efficient collective operations for grids, and a methodology for organizing communication patterns within applications with respect to the underlying physical topology, which we implement in a geophysics application.
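The topology-aware idea behind grid collectives can be sketched in plain Python (no real MPI here; the rank-to-cluster mapping and the allreduce shape are illustrative assumptions): reduce inside each cluster over the fast local network first, so that only one partial result per cluster crosses the slow wide-area link.

```python
from collections import defaultdict

def hierarchical_allreduce(values, cluster_of, op=sum):
    """Topology-aware allreduce sketch: intra-cluster reduction first,
    then a single inter-cluster exchange among cluster leaders,
    then a local broadcast of the global result."""
    # Step 1: intra-cluster reduction (fast local network).
    partial = defaultdict(list)
    for rank, v in enumerate(values):
        partial[cluster_of[rank]].append(v)
    per_cluster = {c: op(vs) for c, vs in partial.items()}
    # Step 2: inter-cluster exchange (one message per cluster on the WAN).
    global_result = op(per_cluster.values())
    # Step 3: local broadcast to every rank.
    return [global_result] * len(values)

# 8 ranks spread over two geographical sites.
cluster_of = [0, 0, 0, 0, 1, 1, 1, 1]
out = hierarchical_allreduce(list(range(8)), cluster_of)
assert out == [28] * 8
```

With a flat algorithm, every rank may exchange data across the WAN; here the number of wide-area messages is proportional to the number of clusters, not the number of processes.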
Distribution, Approximation and Probabilistic Model Checking
APMC is a model checker dedicated to the quantitative verification of fully probabilistic systems against LTL formulas. Because it uses a Monte-Carlo method to efficiently approximate the verification of probabilistic specifications, it lends itself naturally to a distributed framework. We present the tool and its distribution scheme, together with an extensive performance evaluation showing the scalability of the method, even on clusters containing 500+ heterogeneous workstations.
Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI
Fault tolerance in MPI has become a major issue in the HPC community. Several approaches are envisioned, from user- or programmer-controlled fault tolerance to fully automatic fault detection and handling. For this last approach, several protocols have been proposed in the literature. In a recent paper, we demonstrated that uncoordinated checkpointing tolerates a higher fault frequency than coordinated checkpointing. Moreover, causal message logging protocols have been proved the most efficient message logging technique. These protocols piggyback non-deterministic events onto computation messages. Their merits are usually evaluated on four metrics: a) piggybacking computation cost, b) piggyback size, c) application performance, and d) fault recovery performance. In this paper, we investigate the benefit of using stable storage for logging message events in causal message logging protocols. To evaluate the advantage of this technique, we implemented three protocols: 1) the classical causal message logging protocol proposed in Manetho, 2) a state-of-the-art protocol known as LogOn, and 3) a light computation cost protocol called Vcausal. We demonstrate a major impact of this stable storage on the four criteria, for all three protocols, on micro-benchmarks as well as on the NAS benchmarks.
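The mechanism under study, piggybacking determinants of non-deterministic events until an event logger on stable storage acknowledges them, can be sketched as follows (the class and method names are invented for illustration; once a determinant is known stable, it no longer needs to travel on messages):

```python
class EventLogger:
    """Stand-in for stable storage: determinants become 'stable'
    only once explicitly acknowledged."""
    def __init__(self):
        self.stable = set()
    def log_async(self, det):
        pass                       # determinant in flight, not yet stable
    def ack(self, det):
        self.stable.add(det)       # stable storage confirms the determinant

class Process:
    """Causal-message-logging sketch: piggyback every determinant
    the event logger has not yet acknowledged."""
    def __init__(self, logger):
        self.logger = logger
        self.pending = []          # determinants of local nondeterministic events
        self.received = []

    def nondeterministic_event(self, det):
        self.pending.append(det)
        self.logger.log_async(det)

    def send(self, dst, payload):
        piggyback = [d for d in self.pending if d not in self.logger.stable]
        dst.receive(payload, piggyback)

    def receive(self, payload, piggyback):
        self.received.append((payload, piggyback))

logger = EventLogger()
p, q = Process(logger), Process(logger)
p.nondeterministic_event("recv#1")
p.send(q, "m1")                    # "recv#1" piggybacked: not yet stable
logger.ack("recv#1")               # event logger persists the determinant
p.send(q, "m2")                    # nothing left to piggyback
assert q.received[0][1] == ["recv#1"]
assert q.received[1][1] == []
```

This is exactly where the stable storage pays off on the paper's first two metrics: acknowledged determinants stop being piggybacked, shrinking both the piggyback size and the cost of computing it.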
Multi-criteria checkpointing strategies: response-time versus resource utilization
Failures increasingly threaten the efficiency of HPC systems, and current projections of Exascale platforms indicate that rollback recovery, the most convenient method for providing fault tolerance to general-purpose applications, reaches its limits at such scales. One reason for this unnerving situation is the focus that has been given to per-application completion time rather than to platform efficiency. In this paper, we discuss the case of uncoordinated rollback recovery in which the idle time spent waiting for recovering processors is used to progress a different, independent application from the system batch queue. We then propose an extended model of uncoordinated checkpointing that can discriminate between idle time and wasted computation. We instantiate this model in a simulator to demonstrate that, with this strategy, per-application completion time under uncoordinated checkpointing is unchanged, while it delivers near-perfect platform efficiency.
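The paper's extended uncoordinated model is not reproduced here, but the trade-off it refines can be illustrated with the classic first-order checkpoint-interval analysis (Young's approximation; the cost and MTBF figures below are made up for illustration): too-frequent checkpoints waste time on checkpoint overhead, too-rare checkpoints waste time re-executing lost work.

```python
import math

def young_interval(C, mtbf):
    """Young's first-order approximation of the optimal checkpoint
    interval, tau = sqrt(2 * C * MTBF), for checkpoint cost C."""
    return math.sqrt(2 * C * mtbf)

def waste(tau, C, mtbf):
    """First-order expected fraction of time lost: checkpoint overhead
    per interval plus expected re-execution after a failure."""
    return C / tau + tau / (2 * mtbf)

C, mtbf = 60.0, 86400.0            # 1-minute checkpoints, 1-day MTBF
tau = young_interval(C, mtbf)      # about 54 minutes
# At the optimum the two waste terms balance exactly.
assert abs(C / tau - tau / (2 * mtbf)) < 1e-9
```

The paper's point is precisely that this per-application "waste" view is incomplete: time a recovering application would count as waste can be idle time from the platform's perspective, reusable for another job from the batch queue.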
A Multithreaded Communication Substrate for OpenSHMEM
OpenSHMEM scalability is strongly dependent on the capability of its communication layer to efficiently handle multiple threads. In this paper, we present an early evaluation of the thread safety specification in the Unified Common Communication Substrate (UCCS) employed in OpenSHMEM. Results demonstrate that thread safety can be provided at an acceptable cost and can improve efficiency for some operations, compared to serializing communication.
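The kind of gain a thread-safe substrate can offer is easy to illustrate: with one lock per peer endpoint instead of a single global lock, operations on independent channels stay correct without serializing each other. A toy Python sketch (this is not UCCS's actual API; names and structure are invented):

```python
import threading

class Endpoint:
    """Toy communication substrate: fine-grained thread safety via one
    lock per peer, so only same-peer operations contend."""
    def __init__(self, n_peers):
        self.locks = [threading.Lock() for _ in range(n_peers)]
        self.counters = [0] * n_peers   # stand-in for per-peer send state

    def put(self, peer, n=1):
        with self.locks[peer]:          # threads targeting other peers proceed
            self.counters[peer] += n

ep = Endpoint(n_peers=2)
# Four threads per peer, 10000 operations each.
threads = [threading.Thread(target=lambda p=p: [ep.put(p) for _ in range(10000)])
           for p in (0, 1) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert ep.counters == [40000, 40000]   # no updates lost despite concurrency
```

A single global lock would give the same final counts but would force the two peers' traffic through one critical section, which is the serialization cost the abstract compares against.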